2015 IEEE International Conference on Big Data (Big Data)

chapter

Hotspots of news articles: Joint mining of news text & social media to discover controversial points in news

Ismini Lourentzou, Graham Dyer, Abhishek Sharma, ChengXiang Zhai

2015 IEEE International Conference on Big Data (Big Data) > 2948 - 2950

We propose and study a novel problem of mining news text and social media jointly to discover controversial points in news, which enables many applications such as highlighting controversial points in news articles for readers, revealing controversies in news and their trends over time, and quantifying the controversy of a news source. We design a controversy scoring function to discover the most...

chapter

Improving the quality of semantic relationships extracted from massive user behavioral data

Khalifeh AlJadda, Mohammed Korayem, Trey Grainger

2015 IEEE International Conference on Big Data (Big Data) > 2951 - 2953

As the ability to store and process massive amounts of user behavioral data increases, new approaches continue to arise for leveraging the wisdom of the crowds to gain insights that were previously very challenging to discover by text mining alone. For example, through collaborative filtering, we can learn previously hidden relationships between items based upon users' interactions with them, and...

chapter

Integrating semantic knowledge into Tag-LDA model through cloud model

Maoyuan Zhang, Fang Yuan, Jianping Zhu

2015 IEEE International Conference on Big Data (Big Data) > 2907 - 2909

Semantic knowledge is usually added to topic models to improve topic coherence. However, it is hard to judge whether semantic information is related to a topic without using complicated lexical characteristics. In this paper, we demonstrate a novel model called the Cloud Transformation Model, which can easily judge whether semantic information is related to a topic, and integrate semantic information into...

chapter

Using probabilistic approach to joint clustering and statistical inference: Analytics for big investment data

Hua Fang, Honggang Wang, Chonggang Wang, Mahmoud Daneshmand

2015 IEEE International Conference on Big Data (Big Data) > 2916 - 2918

This paper proposes a Contrarian Probabilistic Model (CPM) to evaluate the effectiveness of contrarians' investment in preferred stocks using big data from Tradeline. CPM accommodates the unique features of investment data, which are often correlated, nested, heterogeneous, and non-normal, with missing values. Clustering and statistical inference are integrated in CPM, which enables joint investment...

chapter

Using Word2Vec to process big text data

Long Ma, Yanqing Zhang

2015 IEEE International Conference on Big Data (Big Data) > 2895 - 2897

Big data refers to broad data sets that have been used in many fields. Processing a huge data set is time-consuming, not only because of its sheer volume, but also because data types and structures can be diverse and complex. Currently, many data mining and machine learning techniques are being applied to big data problems; some of them can construct a good learning algorithm in terms...

chapter

Inferring bike trip patterns from bike sharing system open data

Longbiao Chen, Jeremie Jakubowicz

2015 IEEE International Conference on Big Data (Big Data) > 2898 - 2900

Understanding bike trip patterns in a bike sharing system is important for researchers designing models for station placement and bike scheduling. By bike trip patterns, we refer to the large number of bike trips observed between two stations. However, due to privacy and operational concerns, bike trip data are usually not made publicly available. In this paper, instead of relying on time-consuming...

chapter

Finding banded patterns in big data using sampling

Fatimah B Abdullahi, Frans Coenen, Russell Martin

2015 IEEE International Conference on Big Data (Big Data) > 2233 - 2242

A mechanism for identifying bandings in large "zero-one" N-dimensional data sets, using a sampling technique, is presented. The challenge of identifying bandings in data is the large number of potential permutations that need to be considered. To circumvent this, a banding score mechanism is proposed that avoids the need to consider large numbers of permutations. This has been incorporated...

chapter

Scalable preference queries for high-dimensional data using map-reduce

Gheorghi Guzun, Joel E. Tosado, Guadalupe Canahuate

2015 IEEE International Conference on Big Data (Big Data) > 2243 - 2252

Preference (top-k) queries play a key role in modern data analytics tasks. Top-k techniques rely on ranking functions to determine an overall score for each object across all the relevant attributes being examined. This ranking function is provided by the user at query time, or generated for a particular user by a personalized search engine, which prevents the pre-computation of the...
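
As a rough illustration of why a ranking function supplied at query time rules out precomputed results, the sketch below scores objects with a weighted sum and keeps the k best. It is not the paper's map-reduce method; the function name `top_k`, the sample data, and the weighted-sum scoring are assumptions made only for this example.

```python
import heapq

def top_k(objects, weights, k):
    """Return the k (score, id) pairs with the highest weighted-sum score.

    objects: dict mapping an object id to a list of attribute values
    weights: per-attribute weights, known only at query time (assumed form)
    """
    def score(attrs):
        return sum(w * a for w, a in zip(weights, attrs))

    scored = ((score(attrs), obj_id) for obj_id, attrs in objects.items())
    return heapq.nlargest(k, scored)

# Hypothetical data: two attributes per object.
items = {"a": [0.9, 0.2], "b": [0.5, 0.5], "c": [0.1, 0.8]}

# Two users with different preferences get different rankings.
first_user = [obj_id for _, obj_id in top_k(items, [0.7, 0.3], 2)]
second_user = [obj_id for _, obj_id in top_k(items, [0.2, 0.8], 2)]
```

Because the two weight vectors produce different orderings of the same objects, a ranking computed ahead of time for one user cannot simply be reused for another.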

chapter

Towards a taxonomy of standards in smart data

Alexander Lenk, Leif Bonorden, Astrid Hellmanns, Nico Roedder, et al.

2015 IEEE International Conference on Big Data (Big Data) > 1749 - 1754

The usage of large amounts of data has an immense potential for global economic growth and the competitiveness of countries with high technological standards. Vast amounts of data from different sources are collected and analyzed in order to seek economic profit and competitive advantages for companies and society in general. To gain profit from such data, it needs to be analyzed, processed, and interpreted...

chapter

Mixed-initiative social media analytics at the World Bank: Observations of citizen sentiment in Twitter data to explore "trust" of political actors and state institutions and its relationship to social protest

Nadya A. Calderon, Brian Fisher, Jeff Hemsley, Billy Ceskavich, et al.

2015 IEEE International Conference on Big Data (Big Data) > 1678 - 1687

This paper discusses a project that studied the relationship between citizen trust and social protest using visual analysis of approximately 11 million sentiment-classified Tweets from the period of the 2014 Brazilian World Cup. The results of the study reveal that the 2014 World Cup protests in Brazil sprang from a wide range of grievances coupled with a relative sense of deprivation compared with...

chapter

Data deidentification in medical transcriptions using regular expressions and machine learning

Joshua Seeger, Aron Culotta, Jason Keller, Patrick van Kessel, et al.

2015 IEEE International Conference on Big Data (Big Data) > 1322 - 1323

A system is developed to redact personally identifiable information (PII) from millions of medical transcriptions with very high precision, through a combination of entity recognition, regular expressions, and machine learning. The system is trained and tested on manually redacted medical transcriptions using an internally developed coding system, providing double-blind classification capabilities.
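
The regular-expression component of such a pipeline can be sketched roughly as follows. The patterns, labels, and sample sentence are hypothetical and chosen only for illustration; the paper's actual rules are not given in this abstract, and the entity recognition and machine learning stages are not shown.

```python
import re

# Hypothetical patterns for illustration; real PII rules would be far broader.
PATTERNS = {
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text):
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

sample = "Patient seen on 03/14/2015, callback 555-867-5309, SSN 123-45-6789."
redacted = redact(sample)
```

Pattern-only redaction of this kind misses names and free-form identifiers, which is presumably why the abstract pairs it with entity recognition and machine learning.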

chapter

High quality clustering of big data and solving empty-clustering problem with an evolutionary hybrid algorithm

Jeyhun Karimov, Murat Ozbayoglu

2015 IEEE International Conference on Big Data (Big Data) > 1473 - 1478

Achieving high-quality clustering is one of the most well-known problems in data mining. k-means is by far the most commonly used clustering algorithm. It converges fairly quickly, but a good solution is not guaranteed. The clustering quality is highly dependent on the selection of the initial centroids. Moreover, when the number of clusters increases, it starts to suffer from...
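
The dependence on initial centroids, and the empty-clustering problem named in the title, can be seen in a minimal sketch of plain Lloyd-style k-means on one-dimensional data. This is not the paper's evolutionary hybrid algorithm; the function `kmeans_1d`, the data, and the choice to leave an empty cluster's centroid unchanged are assumptions for illustration.

```python
def kmeans_1d(points, centroids, iters=10):
    """Run plain k-means iterations on 1-D data; return (centroids, clusters)."""
    clusters = [[] for _ in centroids]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster has no mean, so its old centroid is kept (assumed policy).
        centroids = [sum(c) / len(c) if c else old
                     for c, old in zip(clusters, centroids)]
    return centroids, clusters

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

# Reasonable start: both groups are recovered.
good_centroids, good_clusters = kmeans_1d(points, [0.0, 9.0])

# Poor start: the second centroid never attracts a point and its cluster stays empty.
bad_centroids, bad_clusters = kmeans_1d(points, [0.0, 100.0])
```

With the poor start, one centroid ends up averaging all six points while the other cluster remains empty, so the run effectively returns a single cluster even though two were requested.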

chapter

Performance assessment and uncertainty quantification of predictive models for smart manufacturing systems

Luca Oneto, Ilenia Orlandi, Davide Anguita

2015 IEEE International Conference on Big Data (Big Data) > 1436 - 1445

In this paper, we review several methods from Statistical Learning Theory (SLT) for the performance assessment and uncertainty quantification of predictive models. Computational issues are addressed so as to allow scaling to large datasets and the application of SLT to Big Data analytics. The effectiveness of the application of SLT to manufacturing systems is exemplified by targeting the derivation...

chapter

Toward locality-aware scheduling for containerized cloud services

Dongfang Zhao, Nagapramod Mandagere, Gabriel Alatorre, Mohamed Mohamed, et al.

2015 IEEE International Conference on Big Data (Big Data) > 263 - 270

The state-of-the-art scheduler of containerized cloud services considers load balance as the only criterion and neglects many others, such as application performance. In the era of Big Data, however, applications have evolved to be highly data-intensive and thus perform poorly in existing systems. This particularly holds for Platform-as-a-Service environments that encourage an application model of stateless...

chapter

Composable and efficient functional big data processing framework

Dongyao Wu, Sherif Sakr, Liming Zhu, Qinghua Lu

2015 IEEE International Conference on Big Data (Big Data) > 279 - 286

Over the past years, frameworks such as MapReduce and Spark have been introduced to ease the task of developing big data programs and applications. However, the jobs in these frameworks are roughly defined and packaged as executable jars without any functionality being exposed or described. This means that deployed jobs are not natively composable and reusable for subsequent development. Besides,...

chapter

How to make money from your information and keep your privacy

Divya Rao, Wee Keong Ng

2015 IEEE International Conference on Big Data (Big Data) > 2859 - 2861

Today big data is synonymous with every business and organization, so much so that data brokers have made a business of trading this big data like any other commodity. In turn, the buyers of this big data make massive profits. The only one who loses out on profits and his privacy is the internet user — the generator and owner of this big data. Our work looks at allowing the user to monetize on his...

chapter

Data confidentiality challenges in big data applications

Jian Yin, Dongfang Zhao

2015 IEEE International Conference on Big Data (Big Data) > 2886 - 2888

In this paper, we address the problem of data confidentiality in big data analytics. In many fields, many useful patterns can be extracted by applying machine learning techniques to big data. However, data confidentiality must be protected. In many scenarios, data confidentiality could well be a prerequisite for data to be shared. We present a scheme to provide provably secure data confidentiality...

chapter

Online pattern mining for high-dimensional data streams

Yoshitaka Yamamoto, Koji Iwanuma

2015 IEEE International Conference on Big Data (Big Data) > 2880 - 2882

This paper studies one-scan approximation algorithms for streaming data mining (SDM). Despite the importance of pattern discovery in streaming data, this issue has not yet been sufficiently addressed in the big data community. In this context, we briefly review the previously proposed SDM methods. A recent work addresses their limitations using the technique of online compression. It is based...

chapter

Scheduling of Big Data application workflows in cloud and inter-cloud environments

B. Kezia Rani, A. Vinaya Babu

2015 IEEE International Conference on Big Data (Big Data) > 2862 - 2864

Large amounts of data are being generated every day, creating new challenges and opportunities that lead to extraordinary new knowledge and discoveries in many application domains ranging from science and engineering to business. One of the main challenges in this era of Big Data is how to efficiently manage and analyse data at such scale. This is challenging not only due to the size of the data,...

chapter

Detecting rumor patterns in streaming social media

Shihan Wang, Takao Terano

2015 IEEE International Conference on Big Data (Big Data) > 2709 - 2715

Rumor detection in streaming social media is a significant but challenging problem. In this paper, we present a method to identify rumor patterns in the streaming social media environment. Patterns that combine both structural and behavioral properties of rumors are first proposed to distinguish false rumors from valid news. A novel graph-based pattern matching algorithm is also described to detect...
